Boltzmann's principle is used to select the "most probable" realization (macrostate) of an isolated
or closed thermodynamic system containing a small number of particles ($N \ll \infty$), for both
classical and quantum statistics. The inferred probability distributions provide the means to define
intensive variables and construct thermodynamic relationships for small microcanonical systems, which
do not satisfy the thermodynamic limit. This is of critical importance to nanoscience and quantum
technology.
INTRODUCTION



In 1877, Boltzmann [1] discovered the combinatorial basis of entropy, usually expressed as [2]:

$$ S_{total} = k \ln W \qquad (1) $$

where $S_{total}$ is the total thermodynamic entropy of a system, $k$ is the Boltzmann constant
and $W$ the statistical weight, i.e. the number of ways in which a given realization (macrostate)
of the system can occur, as defined by the number of particles $n_i$ in each category
$i = 1, \dots, s$ of the system. Eq. (1) can be rewritten in the dimensionless form:

$$ H = \frac{S_{total}}{kN} = N^{-1} \ln W \qquad (2) $$

where $H$ is the dimensionless entropy per particle and $N$ is the (actual) number of particles [3].
Eq. (2) extends naturally to the probabilistic definition [3-5]:

$$ D = -N^{-1} \ln P \qquad (3) $$

where $D$ is the divergence or cross-entropy of the system, per unit particle, and
$P = P(\{n_i\} | \{q_i\}, N)$ is the probability of occurrence of a given realization, subject
to $N$ and the source distributions ("prior probabilities") $q_i$ of each category. Maximisation
of $H$ (MaxEnt) or minimisation of $D$ (MinXEnt), subject to its constraints, therefore selects the
realization of highest weight $W$ or probability $P$, a technique which can be termed the maximum
probability principle (MaxProb) [1-5]. The inferred distribution is then used to represent the
system. Typically, $W$ or $P$ are considered to follow the multinomial weight or distribution:

$$ W = N! \prod_{i=1}^{s} \frac{1}{n_i!} \qquad (4) $$

$$ P = N! \prod_{i=1}^{s} \frac{q_i^{n_i}}{n_i!} \qquad (5) $$

obtained either by a "frequentist" model of the system, or by a Bayesian inferential method
involving a weighted sum of all possible models [3, 6]. In these cases, $H$ and $D$ converge
respectively in the asymptotic limit $N \to \infty$ (by the Sanov theorem [7]) to the Shannon [8]
entropy or Kullback-Leibler [9] cross-entropy functions:

$$ H_{Sh} = -\sum_{i=1}^{s} p_i \ln p_i \qquad (6) $$

$$ D_{KL} = \sum_{i=1}^{s} p_i \ln \frac{p_i}{q_i} \qquad (7) $$

where $p_i = n_i/N$ is the frequency or probability of occupancy of the $i$th category. This
provides a probabilistic justification for these functions (based on $W$ or $P$), independent of
the standard axiomatic derivations given in information theory [8-10]. The cross-entropy (3) or
(7) contains the source distributions and is thus more general than the entropy (2) or (6); in
thermodynamics, this is often handled by taking $q_i = g_i/G$, where $g_i$ is the number of
distinguishable subcategories in category $i$ (its degeneracy) and $G = \sum_{i=1}^{s} g_i$.
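
The convergence of the combinatorial entropy (2) towards the Shannon function can be checked numerically. The following is a minimal sketch only: the occupancy proportions 1:2:3 and the function names are illustrative assumptions, not taken from the analysis. It evaluates $N^{-1} \ln W$ for the multinomial weight and reports its gap to $-\sum_i p_i \ln p_i$ as $N$ grows at fixed proportions.

```python
import math

def entropy_per_particle(counts):
    """H = (1/N) ln W for the multinomial weight W = N! / prod(n_i!)."""
    n_total = sum(counts)
    log_w = math.lgamma(n_total + 1) - sum(math.lgamma(n + 1) for n in counts)
    return log_w / n_total

def shannon_entropy(counts):
    """Shannon entropy -sum p_i ln p_i with p_i = n_i / N."""
    n_total = sum(counts)
    return -sum((n / n_total) * math.log(n / n_total) for n in counts if n > 0)

# Illustrative occupancies in fixed proportions 1:2:3 (an assumption).
base = (1, 2, 3)
gaps = []
for scale in (1, 10, 100, 1000, 10000):
    counts = tuple(scale * b for b in base)
    gap = shannon_entropy(counts) - entropy_per_particle(counts)
    gaps.append(gap)
    print(f"N = {sum(counts):6d}  H = {entropy_per_particle(counts):.6f}  gap = {gap:.6f}")
```

The gap shrinks as $N$ increases but remains appreciable at small $N$, which is the regime of interest here.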
It is of interest to consider systems of small numbers of particles $N \ll \infty$, of critical
importance in nanoscience, biochemistry and quantum technology. Such systems will not satisfy the
"thermodynamic limit", in which both $N$ and the system volume tend to infinity whilst the particle
density remains constant [11-13]. For such a system, is it possible to infer a representative
distribution of particles amongst its categories? Inference using (6) or (7) is not possible, since
these require the asymptotic limit $N \to \infty$. From the above discussion, it is clear that
inference must proceed by the MaxProb principle, involving extremisation of the entropy or
cross-entropy defined by (2)-(3), since this invokes a simple probabilistic proposition, which is
independent of (indeed, it defines) thermodynamic concepts. Although the inferred distribution will
not be as dominant as in the asymptotic case (i.e. the "most probable" will not be the "only
observable" distribution), this should not deter us from conducting this analysis, nor prevent us
from enlarging the body of thermodynamics to explore the effect of $N$.

FIG. 1: Non-asymptotic analyses of (a) an isolated system, and (b) a closed (energy-diffusive) system.

The aim of this study is to initiate a new non-asymptotic formulation of thermodynamics for small
$N$ systems, based exclusively on the MaxProb principle. This differs markedly from ensemble-based
formulations for small systems, such as by Hill [12]. The new approach has the advantages of a
strong foundation in probability theory, and of being directly applicable to individual systems of
particles rather than ensembles of systems, both within and beyond the domain of thermodynamics.
In turn, we examine an isolated (§2) and a closed (energy-diffusive) (§3) small system, subject to
an energy constraint. The analyses are related, asymptotically, to the microcanonical ensemble and
certain features of the canonical ensemble. Conclusions are drawn concerning the system temperature
and zeroth law of thermodynamics. In §4, the analysis is extended to quantum systems governed by
Bose-Einstein (BE) or Fermi-Dirac (FD) statistics. The results have important implications for the
thermodynamics of small systems.



ISOLATED SYSTEMS 

We first consider a single, isolated system of $N < \infty$ particles enclosed by a particle- and
energy-impermeable wall, of constant total energy $E_T$, as shown in Figure 1a. The particles are
distributed amongst energy levels $\epsilon_i$, $i = 1, \dots, s$, of source probabilities $q_i$.
In the first instance, the filling of particles in levels is assumed to follow classical
multinomial statistics (5) (n.b. quantum statistics are considered in §4). The system is therefore
the non-asymptotic form of the microcanonical ensemble. From the MaxProb principle (3), the
cross-entropy is:

$$ D = -\frac{1}{N} \ln N! - \frac{1}{N} \sum_{i=1}^{s} \left( n_i \ln q_i - \ln n_i! \right) \qquad (8) $$

which is subject to the constraints:

$$ \sum_{i=1}^{s} n_i = N \qquad (9) $$

$$ \sum_{i=1}^{s} n_i \epsilon_i = E_T = \langle E \rangle N \qquad (10) $$

where $\langle E \rangle$ is the mean energy per particle. Applying the calculus of variations to
(8)-(10) gives the Lagrangian:
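
$$ L = -\frac{1}{N} \sum_{i=1}^{s} \left( \frac{n_i}{N} \ln N! + n_i \ln q_i - \ln n_i! \right) + \kappa_0 \left( \sum_{i=1}^{s} n_i - N \right) + \kappa_1 \left( \sum_{i=1}^{s} n_i \epsilon_i - \langle E \rangle N \right) \qquad (11) $$

where $\kappa_0$ and $\kappa_1$ are Lagrangian multipliers associated with constraints (9)-(10),
and the leading $\ln N!$ term is brought inside the sum using (9). The extremum of (11), defined by
$\delta L = 0$, gives $\partial L / \partial n_i = 0$, $\forall i$ for constant $N$; this yields
the most probable non-asymptotic distribution for the system: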

$$ p_i^{\#} = \frac{n_i^{\#}}{N} = \frac{1}{N} \Lambda^{-1} \left( \frac{1}{N} \ln N! + \ln q_i - \lambda_0 - \lambda_1 \epsilon_i \right) \qquad (12) $$

where $\lambda_j = \kappa_j N$ for $j \in \{0, 1\}$ are modified Lagrangian multipliers, whilst
$\Lambda^{-1}(y) = \psi^{-1}(y) - 1$ is the upper inverse of the function $\Lambda(x) = \psi(x + 1)$,
wherein $\psi(x)$ is the digamma function. Note (12) is also obtained if one extremises $\ln P$
instead of $D$. There is no factorisable partition function, hence (12) must be solved
simultaneously with both constraints (9)-(10).
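
To indicate how such a solution might proceed numerically, the sketch below treats the simplest case in which only the natural constraint applies. Extremising $\ln P$ for the multinomial distribution then gives the condition $\psi(n_i + 1) = \ln q_i - \mu$, where the single multiplier $\mu$ (a notation assumed here, absorbing all constant terms) is fixed by requiring the occupancies to sum to $N$. The hand-written digamma routine, the bisection brackets and the source probabilities $q = [1, 4, 9]/14$ are illustrative assumptions.

```python
import math

def digamma(x):
    """Digamma function psi(x) for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:                      # shift upwards: psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series

def big_lambda_inverse(y):
    """Solve psi(n + 1) = y for n > -1 by bisection (the upper inverse)."""
    lo, hi = 1e-12, math.exp(y) + 1.0   # bracket for z = n + 1
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if digamma(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) - 1.0

def most_probable_occupancies(q, n_total):
    """Occupancies with psi(n_i + 1) = ln q_i - mu and sum(n_i) = N."""
    def total(mu):
        return sum(big_lambda_inverse(math.log(qi) - mu) for qi in q)
    lo, hi = -math.log(n_total) - 5.0, 50.0   # total(lo) > N > total(hi)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if total(mid) > n_total:
            lo = mid
        else:
            hi = mid
    mu = 0.5 * (lo + hi)
    return [big_lambda_inverse(math.log(qi) - mu) for qi in q]

q = [1 / 14, 4 / 14, 9 / 14]            # illustrative source probabilities
for n_total in (10, 100, 1000):
    n = most_probable_occupancies(q, n_total)
    print(n_total, [round(ni / n_total, 4) for ni in n])
```

With an energy constraint, the same inner inversion would apply, but two multipliers would need to be solved for simultaneously (for example by a two-dimensional root finder).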

In the asymptotic limit $N \to \infty$ (e.g. Sanov's [7] theorem), extremisation of the
Kullback-Leibler function (7) subject to (9)-(10) yields the Boltzmann distribution:

$$ p_i^{*} = q_i \exp(-\lambda_0^{*} - \lambda_1^{*} \epsilon_i) \qquad (13) $$

where $\lambda_0^{*}$ and $\lambda_1^{*}$ are the corresponding asymptotic multipliers. As a
corollary, (12) must converge to (13) as $N \to \infty$.



The character of a non-asymptotic system can be illustrated by two examples. Firstly, consider a
system of $N$ particles with three energy levels of degeneracies $g = [1, 4, 9]$, subject only to
the natural constraint (9). The probabilities $P$ of each realization $[n_1, n_2, n_3]$, calculated
by (5) for different $N$ using a combinatorial search scheme, are illustrated by "mortarboard
plots" against $n_1$ and $n_2$ in Figures 2a-d. The inferred non-asymptotic (12) and asymptotic
(13) distributions are also shown. As evident, for large $N$ (Figure 2d), the set of realizations
is highly concentrated around the asymptotic distribution, and the non-asymptotic distribution
converges to this peak. However, for small $N$ (Figures 2a-b), the realizations are sparser, with
the system less dominated by its most probable realization. At low $N$, the inferred
non-asymptotic distribution also becomes distinct from the asymptotic; moreover, due to the
quantisation of levels, it may not coincide with the true most probable realization (the highest
peak), but may lie within some neighbourhood of it. Even in this distinctly discrete case, the
predicted distribution (12), being the predicted most probable distribution of the system, can be
used for further inference about the system.
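
A combinatorial search of this kind can be sketched as follows. This is a minimal illustration, not the scheme actually used for the figures: the values of $N$, the tolerance `tol` and the function names are assumptions. For each $N$ it enumerates every realization $[n_1, n_2, n_3]$, evaluates the multinomial probability with $q_i = g_i/G$, and reports the most probable realization together with the probability mass lying within a fixed frequency neighbourhood of the source distribution.

```python
import math

g = [1, 4, 9]                        # degeneracies from the text
q = [gi / sum(g) for gi in g]        # source probabilities q_i = g_i / G

def multinomial_prob(counts):
    """Multinomial probability of a realization, via log-gamma for stability."""
    n_total = sum(counts)
    log_p = math.lgamma(n_total + 1) + sum(
        n * math.log(qi) - math.lgamma(n + 1) for n, qi in zip(counts, q))
    return math.exp(log_p)

def realizations(n_total):
    """All (n1, n2, n3) with n1 + n2 + n3 = N."""
    for n1 in range(n_total + 1):
        for n2 in range(n_total - n1 + 1):
            yield (n1, n2, n_total - n1 - n2)

def summarise(n_total, tol=0.1):
    """Most probable realization, total probability, and mass with |n_i/N - q_i| < tol."""
    probs = {r: multinomial_prob(r) for r in realizations(n_total)}
    best = max(probs, key=probs.get)
    near = sum(p for r, p in probs.items()
               if all(abs(n / n_total - qi) < tol for n, qi in zip(r, q)))
    return best, sum(probs.values()), near

for n_total in (3, 27, 243):         # illustrative system sizes
    best, total, near = summarise(n_total)
    print(f"N = {n_total:3d}  most probable = {best}  total = {total:.6f}  mass near q = {near:.4f}")
```

The growth of the neighbourhood mass with $N$ gives one simple measure of how strongly the set of realizations concentrates around the asymptotic distribution.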

Secondly, consider the above multinomial system subject also to the energy constraint (10) with
$\langle E \rangle$ = | or 2, for which plots of each realization for various $N$ are given
respectively in Figures S2-S3 in the Supplementary Data. The energy levels are taken as
$\epsilon = [1, 2, 4]$. One example is shown in Figure 3a. Since the energy constraint excludes
many realizations, it is necessary to renormalise the probabilities calculated using (5). Further
plots, to show the effect of $\langle E \rangle$ at $N = 729$, are given in Figure S4; note that
in this system, an asymptotic uniform distribution ($p_i = q_i$, i.e. $\lambda_1 = 0$ or infinite
temperature) corresponds to $\langle E \rangle = 3.214$. As evident, imposing the energy constraint
causes a dramatic reduction ("pruning") in the number of available realizations. This
substantially expands the non-asymptotic domain (to much higher $N \gtrsim 1000$ to 5000), widening
the separation between the inferred asymptotic and non-asymptotic distributions, and increasing
the effect of quantisation. Plots of the effect of $N$ and $\langle E \rangle$ on the
non-asymptotic multipliers $\lambda_0$ and $\lambda_1$ are also given in Figure S5, of which one
case is shown in Figure 3b. In all cases, the Massieu function $\lambda_0$ converges to its known
asymptotic value $\lambda_0^{*} - 1$ as $N \to \infty$, where $\lambda_0^{*}$ is the asymptotic
multiplier. Similarly, the inverse temperature $\lambda_1 = 1/kT$ converges to a constant value
(related to $\langle E \rangle$) as $N \to \infty$. Depending on the value of $\langle E \rangle$,
the multipliers could converge in either direction, i.e. from positions of lower or higher free
energy and/or from lower or higher temperature. Several peaked convergence curves in $\lambda_0$
are also observed.
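
The pruning and renormalisation steps can be illustrated with a short sketch, using the stated values $g = [1, 4, 9]$, $\epsilon = [1, 2, 4]$, $\langle E \rangle = 2$ and $N = 243$. The function names and the brute-force filtering are assumptions made for illustration only. The final line evaluates the mean energy of the distribution $p_i = q_i$, which reproduces the quoted infinite-temperature value $45/14 \approx 3.214$.

```python
import math

G_LEVELS = [1, 4, 9]                 # degeneracies g from the text
EPS = [1, 2, 4]                      # energy levels from the text
Q = [g / sum(G_LEVELS) for g in G_LEVELS]

def log_prob(counts):
    """ln P for the multinomial distribution with q_i = g_i / G."""
    n_total = sum(counts)
    return math.lgamma(n_total + 1) + sum(
        n * math.log(q) - math.lgamma(n + 1) for n, q in zip(counts, Q))

def allowed_realizations(n_total, e_total):
    """Realizations with sum(n_i) = N and sum(n_i * eps_i) = E_T."""
    found = []
    for n1 in range(n_total + 1):
        for n2 in range(n_total - n1 + 1):
            counts = (n1, n2, n_total - n1 - n2)
            if sum(n * e for n, e in zip(counts, EPS)) == e_total:
                found.append(counts)
    return found

def renormalised_probs(n_total, e_total):
    """Probabilities of the allowed realizations, rescaled to sum to one."""
    allowed = allowed_realizations(n_total, e_total)
    logs = [log_prob(c) for c in allowed]
    shift = max(logs)                # subtract the maximum to avoid underflow
    weights = [math.exp(v - shift) for v in logs]
    norm = sum(weights)
    return {c: w / norm for c, w in zip(allowed, weights)}

n_total = 243
probs = renormalised_probs(n_total, 2 * n_total)        # <E> = 2
unconstrained = (n_total + 1) * (n_total + 2) // 2      # realizations without (10)
print(f"{len(probs)} of {unconstrained} realizations satisfy the energy constraint")
print("most probable allowed realization:", max(probs, key=probs.get))
print("mean energy at p_i = q_i:", sum(q * e for q, e in zip(Q, EPS)))
```

For these levels the two constraints together force $n_1 = 2 n_3$, so the allowed realizations form a single one-parameter family, which illustrates the extent of the pruning.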



CLOSED (ENERGY-DIFFUSIVE) SYSTEMS 

Now consider two systems in contact, as shown in Figure 1b, in which System 1 contains $N_1$
particles with energy levels $\epsilon_i$, $i = 1, \dots, s$, of source distributions $q_i$,
whilst System 2 contains $N_2$ particles with energy levels indexed by $j = 1, \dots, t$, of
source distributions $r_j$; this could represent two systems in contact, or a single system in
contact with an energy bath. The double system is enclosed by an impermeable wall containing total
energy $E_T$; within this, the systems make contact via a particle-impermeable but
energy-permeable wall. For multinomial statistics, the probabilities of realization $\{n_i\}$ in
System 1 and realization $\{m_j\}$ in System 2 are respectively:
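
$$ P_1 = N_1! \prod_{i=1}^{s} \frac{q_i^{n_i}}{n_i!} \qquad (14) $$

$$ P_2 = N_2! \prod_{j=1}^{t} \frac{r_j^{m_j}}{m_j!} \qquad (15) $$

If there is no correlation between occupancies in Systems 1 and 2, $P_1$ and $P_2$ are
independent, whence the joint probability of the double realization $\{\{n_i\}, \{m_j\}\}$ is:

$$ P = P_1 P_2 = N_1! \, N_2! \prod_{i=1}^{s} \frac{q_i^{n_i}}{n_i!} \prod_{j=1}^{t} \frac{r_j^{m_j}}{m_j!} \qquad (16) $$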

Note that in this case, it is necessary to analyse the double system using raw probabilistic
principles, to ensure consistency of reasoning [3]; analysis using a predefined entropy or
cross-entropy function (with its $N^{-1}$ divisor) can give incorrect results. Accordingly, we
must extremise the logarithm of (16), subject to the constraints:

where $\lambda_{0a}$, $\lambda_{0b}$ and $\lambda_1$ are Lagrangian multipliers associated with
constraints (17)-(19). The extremum $\delta L = 0$ of (20) gives $\partial L / \partial n_i = 0$,
$\forall i$ and $\partial L / \partial m_j = 0$, $\forall j$ for constant $N_1$ and $N_2$, giving
the inferred double distribution:

Again there are no factorable partition functions, hence (21)-(22) must be solved simultaneously
with (17)-(19).



The analysis is therefore constructed on a microcanonical basis (systems of particles rather than
ensembles of systems), but shares some features of the canonical ensemble. From Sanov's [7]
theorem, each distribution converges to its asymptotic form (with multiplier $\lambda_{0a}^{*}$ or
$\lambda_{0b}^{*}$) as $N_1 \to \infty$ or $N_2 \to \infty$.



The inferred distributions (21)-(22) have important implications. Firstly, both share a common
multiplier $\lambda_1$, hence at equilibrium, Systems 1 and 2 are of identical temperature
$T = 1/(k \lambda_1)$; this applies regardless of the number of particles $N_1$ and $N_2$ in each
system (even for single-particle systems). The (statistical) validity of the zeroth law of
thermodynamics in small systems is therefore upheld. Secondly, each distribution $p_i^{\#}$ or
$\pi_j^{\#}$ depends on the number of particles in that system, but not on the number in the other
system, except via the influence of the common temperature. This leads to the unsurprising
conclusion that it is not necessary to impose the thermodynamic limit, or the existence of a bath
containing an infinite number of particles $N_2 \to \infty$, for the concept of temperature to be
valid. The precision of inference (e.g. the reproducibility of the equilibrium position) will be
lower than in the asymptotic case, due to reduced dominance of the most probable peak, but it is
nonetheless meaningful.
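
For very small systems, the most probable double realization can also be located by direct enumeration of the joint multinomial probability under a shared energy budget, without solving for the multipliers. The sketch below is illustrative only: it assumes that both systems share the levels $\epsilon = [1, 2, 4]$ and source probabilities $[1, 4, 9]/14$, and the sizes $N_1 = 6$, $N_2 = 30$ and total energy $E_T = 72$ are hypothetical choices. Since the joint probability is a product, for each split of the energy it suffices to pair the most probable realization of each system at its share of the energy.

```python
import math

EPS = [1, 2, 4]                      # assumed energy levels for both systems
Q = [1 / 14, 4 / 14, 9 / 14]         # assumed source probabilities for both systems

def log_prob(counts):
    """ln P of a multinomial realization."""
    n_total = sum(counts)
    return math.lgamma(n_total + 1) + sum(
        n * math.log(q) - math.lgamma(n + 1) for n, q in zip(counts, Q))

def energy(counts):
    """Total energy of a realization."""
    return sum(n * e for n, e in zip(counts, EPS))

def best_by_energy(n_total):
    """Map each reachable energy to (ln P, realization) of its most probable realization."""
    table = {}
    for n1 in range(n_total + 1):
        for n2 in range(n_total - n1 + 1):
            counts = (n1, n2, n_total - n1 - n2)
            entry = (log_prob(counts), counts)
            e = energy(counts)
            if e not in table or entry > table[e]:
                table[e] = entry
    return table

def most_probable_pair(n1_total, n2_total, e_total):
    """Most probable double realization with energies summing to e_total."""
    t1, t2 = best_by_energy(n1_total), best_by_energy(n2_total)
    best = None
    for e1, (lp1, r1) in t1.items():
        if e_total - e1 in t2:
            lp2, r2 = t2[e_total - e1]
            if best is None or lp1 + lp2 > best[0]:
                best = (lp1 + lp2, r1, r2)
    return best

log_p, r1, r2 = most_probable_pair(6, 30, 72)
print("System 1:", r1, " energy per particle:", energy(r1) / 6)
print("System 2:", r2, " energy per particle:", energy(r2) / 30)
```

Comparing the energy per particle of the two subsystems in the printed result gives a direct, if crude, view of how energy is shared between small systems of different size.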



Returning to the examples in Figures 3 and S1-S5, it is seen that if two non-asymptotic
multinomial subsystems, with the same energy level structure and degeneracy but different numbers
of particles, are of common $\lambda_1$, there must be an imbalance in the energy per particle
$\langle E \rangle$ between the two subsystems. This is equivalent to the statement that the two
subsystems exhibit different heat capacities $d\langle E \rangle_j / dT$, $j = 1, 2$. The analysis
therefore reveals an apparent paradox in the behaviour of non-asymptotic multinomial systems at
low $N$. This arises from forcing a small number of particles, with quantised energy levels, to
adopt a configuration which matches the temperature of another subsystem, producing
non-Boltzmann-like distributions of particles amongst energy levels and concomitant changes in the
energy per particle of the subsystem.

FIG. 3: Example multinomial system subject to the natural and energy constraints with
$\langle E \rangle$ = | (see text): (a) possible realizations for $N = 243$; and (b) variation of
$\lambda_0$ and $\lambda_1$ with $N$.

FIG. 4: Example BE system subject to the natural and energy constraints with
$\langle E \rangle$ = | (see text): (a) possible realizations for $N = 243$; and (b) variation of
$\lambda_0$ and $\lambda_1$ with $N$.



QUANTUM SYSTEMS 

For completeness, we consider the non-asymptotic form of the degenerate Maxwell-Boltzmann (MB),
Bose-Einstein (BE) and Fermi-Dirac (FD) statistics of quantum physics, for the isolated system in
Figure 1a. Although usually represented using weights [TTl IT51 HM 120), from a MaxProb
perspective they are more appropriately represented using the normalised probability
distributions of Brillouin EU [22]:

The MB statistic is identical to the multinomial (5) with $q_i = g_i/G$. From the Boltzmann
principle (2), the resulting non-asymptotic BE and FD cross-entropy functions are:

Maximisation of $P$ or minimisation of $D$ for each case, subject to the constraints (9)-(10),
yields the inferred non-asymptotic distributions:

These can be shown to reduce to the well-known asymptotic distributions of BE and FD statistics
[17, 18, 20] as $N \to \infty$.

Several plots of the probability of each realization, for the same system examined in Figures 2-3
and S1-S5 but now using the BE statistic (26) and (28), are given in Figures S6-S9. One of these
plots is shown in Figure 4a. Analysis using only the natural constraint (Figure S6) produces
similar plots to the multinomial case, albeit with a broader spread of realizations. If an energy
constraint is considered, the resulting plots (Figures 4a and S7-S9) display similar features to
the multinomial system, in that non-asymptotic effects are extended to much higher $N$. Also, the
asymptotic KL realization (13) becomes less and less representative of the system as
$N \to \infty$. However, there is one important difference to the multinomial case: the BE
distribution appears to converge asymptotically to a constant "envelope" of possible
realizations, rather than to a single, sharp peak. This implies that even in the asymptotic limit,
there is considerable uncertainty in inferring the realization of the system, a result in sympathy
with the known quantum behaviour of BE systems. Plots of numerical values of $\lambda_0$ and
$\lambda_1$ are also given in Figures 4b and S10; these reveal consistent convergence towards
asymptotic values as $N \to \infty$.
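
The broader spread of realizations under BE statistics can be illustrated numerically. The sketch below assumes the standard normalised forms $P_{BE} = \prod_i \binom{g_i + n_i - 1}{n_i} / \binom{G + N - 1}{N}$ and $P_{FD} = \prod_i \binom{g_i}{n_i} / \binom{G}{N}$; these are taken here as a working assumption and may differ in presentation from the distributions used in the analysis. The degeneracies $g = [1, 4, 9]$ follow the example system, while $N = 27$ (and $N = 5$ for FD, which requires $N \le G$) are illustrative choices. The sketch checks normalisation and compares the variance of $n_3$ under the MB and BE distributions, subject only to the natural constraint.

```python
import math

g = [1, 4, 9]                        # degeneracies from the example system
G = sum(g)

def realizations(n_total):
    """All (n1, n2, n3) with n1 + n2 + n3 = N."""
    for n1 in range(n_total + 1):
        for n2 in range(n_total - n1 + 1):
            yield (n1, n2, n_total - n1 - n2)

def prob_mb(counts):
    """Multinomial (MB) probability with q_i = g_i / G."""
    n_total = sum(counts)
    log_p = math.lgamma(n_total + 1) + sum(
        n * math.log(gi / G) - math.lgamma(n + 1) for gi, n in zip(g, counts))
    return math.exp(log_p)

def prob_be(counts):
    """Assumed normalised BE probability (multivariate negative hypergeometric)."""
    n_total = sum(counts)
    ways = math.prod(math.comb(gi + n - 1, n) for gi, n in zip(g, counts))
    return ways / math.comb(G + n_total - 1, n_total)

def prob_fd(counts):
    """Assumed normalised FD probability (multivariate hypergeometric)."""
    n_total = sum(counts)
    ways = math.prod(math.comb(gi, n) for gi, n in zip(g, counts))
    return ways / math.comb(G, n_total)

def variance_n3(prob, n_total):
    """Variance of the occupancy of the third level under a given statistic."""
    mean = sum(prob(r) * r[2] for r in realizations(n_total))
    mean_sq = sum(prob(r) * r[2] ** 2 for r in realizations(n_total))
    return mean_sq - mean ** 2

n_total = 27
print("BE total probability:", sum(prob_be(r) for r in realizations(n_total)))
print("FD total probability (N = 5):", sum(prob_fd(r) for r in realizations(5)))
print("Var(n3)  MB:", variance_n3(prob_mb, n_total), " BE:", variance_n3(prob_be, n_total))
```

Under these assumed forms the BE variance exceeds the multinomial variance by the factor $(G + N)/(G + 1)$, which grows with $N$; this is consistent with the broader spread noted above.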

Finally, as evident from the previous analysis (§3), in a closed double quantum system (Figure 1b)
the above distributions will also apply to each subsystem, with $p_i^{\#}$, $g_i$, $\lambda_0$ and
$N$ replaced respectively by $p_i^{\#}$, $g_i$, $\lambda_{0a}$ and $N_1$, or $\pi_j^{\#}$, $g_j$,
$\lambda_{0b}$ and $N_2$.



CONCLUSIONS 

The MaxProb principle (Boltzmann's principle) is used to determine the "most probable"
realization (macrostate) of an isolated or closed thermodynamic system, containing a small number
of particles ($N < \infty$), both for classical and quantum statistics. The inferred distributions
provide the means to define intensive variables and construct thermodynamic relationships in
systems which do not satisfy the thermodynamic limit, using a particle-based rather than ensemble
approach. The inferred distributions will become less reproducible as $N \to 1$, since the most
probable peak will be less and less dominant, but still provide the best distribution for
probabilistic inference. The analysis also reveals several peculiar properties of non-asymptotic
systems, including a difference in the mean energy per particle between non-asymptotic subsystems
in thermal equilibrium, and the asymptotic convergence of quantum (BE) systems to an "envelope" of
possible realizations rather than a sharp peak.

This study concerns systems with fixed $N$. Further work is required on the non-asymptotic
behaviour of systems with variable $N$ (the grand canonical ensemble).